In an adaptive subband excited transform
speech encoding system, a range of quantizers (30) 
are available and are dynamically selected for each 
window of speech. The quantizers (30) designated
for individual subbands are determined to minimize
mean squared error distortion in the recreated signal 
while using no more than a predetermined number 
of quantization bits per window of speech.

Conventional analog telephone systems are being
replaced by digital systems. In digital systems,
the analog signals are sampled at a rate of greater 
than or equal to about twice the bandwidth of the 
analog signals or about eight kilohertz, and the 
samples are then encoded. In a simple pulse code
modulation (PCM) system, each sample is quantized
as one of a discrete set of prechosen values
and encoded as a digital word which is then
transmitted over the telephone lines. With eight bit
digital words, for example, the analog sample is
quantized to $2^8$ or 256 levels, each of which is
designated by a different eight bit word. Using
nonlinear quantization, excellent quality speech can be
obtained.

Efforts have been made to reduce the bit rates 
required to encode the speech and obtain a clear 
decoded speech signal at the receiving end of the 
system. The linear predictive coding (LPC) technique
is based on the recognition that speech
production involves excitation and a filtering
process. The excitation is determined by the vocal
cord vibration for voiced speech and by turbulence
for unvoiced speech, and that actuating signal is
then modified by the filtering process of vocal
resonance chambers including the mouth and nasal
passages. For a particular group of samples, a
digital linear filter which simulates the formant
effects of the resonance chambers can be defined
and the definition can be encoded. A residual
signal which approximates the excitation can then be
obtained by passing the speech signal through an
inverse formant filter, and the residual signal can
be encoded. Because sufficient information is con- 
tained in the lower-frequency portion of the residual 
spectrum, it is possible to encode only the low 
frequency baseband and still obtain reasonably 
clear speech. At the receiver, a definition of the 
formant filter and the residual baseband are de- 
coded. The baseband is repeated to complete the 
spectrum of the residual signal. By applying the 
decoded filter to the repeated baseband signal, an 
approximation to the initial speech can be recon- 
structed. 

A major problem of the LPC approach is in 
defining the formant filter which must be redefined 
with each window of samples. A problem with such 
systems is that they do not always provide a sat- 
isfactory reconstruction of certain formants such as 
that resulting from nasal resonance. As a result, the 
quality of reconstruction from 16,000 bits per sec- 
ond is generally unsatisfactory. 

Another speech coding scheme which exploits 
the concepts of excitation-filter separation and ex- 
citation baseband transmission is described by Zib- 
man in U.S. patent number 4,914,701. In that ap- 
proach, speech is encoded by first performing a 
Fourier transform of a window of speech. The
Fourier transform coefficients are normalized by
first defining a piecewise-constant approximation of
the spectral envelope and then scaling the frequency
coefficients relative to the approximation.
The normalization is accomplished first for each
formant region and then repeated for smaller subbands.
Quantization and transmission of the spectral
envelope approximations amount to transmission
of a filter definition. Quantization and transmission
of the scaled frequency coefficients associated
with either the lower or upper half of the spectrum
amounts to transmission of a "baseband" excitation
signal. At the receiver, the full spectrum of the
excitation signal is obtained by adding the transmitted
baseband to a frequency translated version of
itself. Frequency translation is performed easily by
duplicating the scaled Fourier coefficients of the
baseband into the corresponding higher or lower
frequency positions. A signal can then be fully
recreated by inverse scaling with the transmitted
piecewise-constant approximations. This coding
approach can be very simply implemented and provides
good quality speech at 16 kilobits per second.
However, it performs poorly with non-speech
voice-band data transmission.

A modification of the Zibman coding technique 
is presented by Mazor, et al. in U.S. Patent 
4,790,016. In that approach, the transform spectrum
is divided into a plurality of subbands of
coefficients. The approximate envelope is defined
for each subband and each envelope definition is
encoded for transmission. As in the Zibman approach,
each spectrum coefficient is scaled relative
to the defined envelope of the respective subband.
In the Mazor, et al. improvement, the number of
bits to which each coefficient is encoded is
determined by the defined envelope of its subband.
Specifically, the four subbands having the largest
initial peak energy, and thus the largest envelope
definition, are quantized to seven bits for each
coefficient. The four subbands having the next
smaller envelope definitions are quantized to six
bits per coefficient, and the four next smaller
subbands are quantized to four bits per coefficient.
The coefficients of the remaining subbands are not
transmitted; that is, they were quantized to zero
bits per coefficient. At the receiver, the transmitted
subbands are replicated to define coefficients of
frequencies which are not transmitted.
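
The fixed 7/6/4 scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `fixed_allocation`, the count of twenty subbands, and the envelope values are hypothetical and are not taken from the Mazor, et al. patent.

```python
def fixed_allocation(envelopes):
    """Return bits per coefficient for each subband under a fixed 7/6/4 scheme."""
    # Twelve subbands are always transmitted: four at 7, four at 6, four at 4 bits.
    tiers = [7] * 4 + [6] * 4 + [4] * 4
    # Rank subband indices by decreasing envelope value (sort is stable for ties).
    order = sorted(range(len(envelopes)), key=lambda i: -envelopes[i])
    bits = [0] * len(envelopes)  # subbands left at zero bits are not transmitted
    for rank, idx in enumerate(order[:len(tiers)]):
        bits[idx] = tiers[rank]
    return bits


# Hypothetical envelope values for twenty subbands, largest first.
envelopes = [float(v) for v in range(20, 0, -1)]
bits = fixed_allocation(envelopes)
```

Whatever the signal, the same twelve-subband pattern is produced; only the choice of which subbands receive each tier changes from window to window.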

The prior Mazor, et al. system resulted in much
less than optimal bit allocation with certain types of
signals. For example, a narrow band signal such as
a tone might be best treated by allocating all
available bits to only one or two subbands. In the
prior Mazor, et al. system, bits were allocated to 12
subbands regardless of the signal, so the bits
allocated to a number of the subbands would be
wasted with the narrow band signal. On the other
hand, with wide band signals like white noise all
subbands have about the same signal level, and it
would therefore be better to evenly distribute all
available bits across the full spectrum. In the prior
Mazor, et al. system the bits would always be
allocated in the same way to 12 subbands.

The present invention is an improvement to the 
Mazor, et al. approach. As in that approach, a 
discrete transform of a window of speech is per- 
formed to generate a discrete spectrum of coeffi- 
cients. An approximate envelope of the discrete 
spectrum is defined in each of a plurality of
subbands of coefficients. The envelope definition of
each subband of coefficients is digitally encoded. A 
plurality of quantizers of different bit lengths are 
available for encoding scaled spectrum coefficients 
within each subband. With the present invention, 
the number of subbands applied to each quantizer 
is determined for each window of speech. Thus, 
the quantizers used and the number of subbands 
using the quantizers may be determined for each 
window of speech to minimize distortion due to 
quantization error. 

In the preferred embodiment, the bit allocation 
is determined through an iterative process in which 
distortion is estimated and the bit allocation is 
computed from the estimated distortion. The distor- 
tion is then re-estimated to obtain a bit allocation 
which approaches the number of quantization bits 
available for the window. Specifically, the number
of bits allocated to each subband is computed as
the number of bits greater than or equal to zero
which is approximately equal to and derived from
$\log_2(P_j/D)$, where $P_j$ is a power estimate for the
subband and $D$ is the expected mean squared
error distortion for the entire window of speech.
Preferably, $P_j$ is estimated by table look-up from
the envelope definition of each subband.

In the preferred system, where the calculated 
allocation does not equal the number of available 
bits, the system adds bits to the higher energy 
subbands or subtracts bits from the lower energy 
subbands to make use of the full number of bits 
available for the window. 

In the drawings: 

FIG. 1 is a block diagram of a speech encoder 
and corresponding decoder of a coding system 
embodying the present invention. 
FIG. 2 is an example of a magnitude spectrum 
of the Fourier transform of a window of speech 
illustrating principles of the present invention. 
FIG. 3 is an example spectrum normalized from 
that of FIG. 2 based on principles of the present 
invention. 

FIG. 4 schematically illustrates the iterative pro- 
cess for determining the distortion and the bit 
allocation. 

FIG. 5 is a flow chart of the process for determining
bit allocation for each window of speech.

In the accompanying drawings, like reference
characters refer to the same parts throughout the
different views. The drawings are not necessarily to
scale, emphasis instead being placed upon illustrating
the principles of the invention.

A block diagram of the coding system is shown
in FIG. 1. Prior to compression, the analog speech
signal is low pass filtered in filter 12 at 3.4
kilohertz, sampled in sampler 14 at a rate of 8
kilohertz, and digitized using a 12 bit linear analog
to digital converter 16. It will be recognized that the
input to the encoder may already be in digital form
and may require conversion to the code which can
be accepted by the encoder. The digitized speech
signal, in frames of N samples, is first scaled up in
a scaler 18 to maximize the numerical resolution in
the ensuing processing steps. The scaled input
samples are then Fourier transformed in a fast
Fourier transform device 20 to obtain a corresponding
discrete spectrum represented by (N/2) + 1
complex frequency coefficients.

In a specific implementation, the input frame
size equals 180 samples and corresponds to a
frame every 22.5 milliseconds. However, the discrete
Fourier transform is performed on 192 samples,
including 12 samples overlapped with the
previous frame, preceded by trapezoidal windowing
with a 12 point slope at each end. The resulting
output of the FFT includes 97 complex frequency
coefficients spaced 41.667 Hertz apart.
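
The frame arithmetic above can be checked with a short sketch. The constants come from the text; the exact shape of the 12 point ramps in `trapezoidal_window` is an assumption, since the text gives only the slope length, and the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate from the text
FRAME_SAMPLES = 180     # new samples per frame
FFT_SAMPLES = 192       # transform length, including the overlap
SLOPE_POINTS = 12       # ramp length at each end of the window


def trapezoidal_window(length, slope):
    """Linear ramps of `slope` points at each end, flat (1.0) in between.

    The ramp values (1/(slope+1), 2/(slope+1), ...) are an assumed shape.
    """
    window = []
    for n in range(length):
        edge = min(n, length - 1 - n)  # distance to the nearer end
        window.append(min(1.0, (edge + 1) / (slope + 1)))
    return window


frame_ms = 1000.0 * FRAME_SAMPLES / SAMPLE_RATE_HZ   # frame period
overlap = FFT_SAMPLES - FRAME_SAMPLES                # samples shared with prior frame
num_coefficients = FFT_SAMPLES // 2 + 1              # (N/2) + 1 complex outputs
bin_spacing_hz = SAMPLE_RATE_HZ / FFT_SAMPLES        # spacing between coefficients
window = trapezoidal_window(FFT_SAMPLES, SLOPE_POINTS)
```

With these values the frame period is 22.5 milliseconds, the overlap is 12 samples, and the transform yields 97 coefficients roughly 41.667 Hertz apart, matching the figures stated above.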

An example magnitude spectrum of a Fourier
transform output from FFT 20 is illustrated in FIG.
2. Although illustrated as a continuous function, it is
recognized that the transform circuit 20 actually
provides only 97 incremental complex outputs.

Following the basic approach of Mazor, et al.
presented in U.S. patent number 4,790,016, the
magnitude spectrum of the Fourier transform output
is equalized and encoded. To that end, the
spectrum is partitioned into L contiguous subbands
and a spectral envelope estimate is based on a
piecewise-constant approximation of those subbands
at 22. In a specific implementation the spectrum
is divided into twenty subbands, each including
four complex coefficients. Frequencies above
3291.67 Hertz are not encoded and are set to zero
at the receiver. To equalize the spectrum, the
spectral envelope of each subband is assumed
constant and is defined by the peak magnitude in
each subband as illustrated by the horizontal lines
in FIG. 2. Each magnitude, or more correctly the
inverse thereof, can be treated as a scale factor for
its respective subband. Each scale factor is
quantized in a quantizer 24 to q bits. For example, q
may equal 4.

By then multiplying at 26 the magnitude of
each coefficient of the spectrum by the scale factor
associated with that coefficient, the flattened residual
spectrum of FIG. 3 is obtained. This flattening
of the spectrum is equivalent to inverse filtering the 
signal based on the piecewise-constant estimate of 
the spectral envelope. 
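
The envelope estimation and flattening steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `flatten_spectrum` and the sample spectrum are hypothetical, and the q bit quantization of the scale factors at 24 is omitted, so the peaks are used unquantized.

```python
def flatten_spectrum(coefficients, band_size=4):
    """Return (peaks, flattened) for a list of complex coefficients.

    Each subband of `band_size` coefficients is divided by its peak
    magnitude, i.e. multiplied by the inverse of the envelope value.
    """
    peaks, flattened = [], []
    for start in range(0, len(coefficients), band_size):
        band = coefficients[start:start + band_size]
        peak = max(abs(c) for c in band)          # piecewise-constant envelope
        scale = 1.0 / peak if peak > 0 else 0.0   # scale factor for the subband
        peaks.append(peak)
        flattened.extend(c * scale for c in band)
    return peaks, flattened


# Hypothetical spectrum: two subbands of four complex coefficients each.
spectrum = [complex(3, 4), complex(1, 0), complex(0, 2), complex(0, 0),
            complex(0.5, 0), complex(0, 0.25), complex(0.1, 0), complex(0, 0)]
peaks, flat = flatten_spectrum(spectrum)
```

After flattening, the largest magnitude in every non-empty subband is 1, which is why the residual spectrum appears level from subband to subband.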

Only selected subbands of the flattened spec- 
trum of Figure 3 are quantized and transmitted, and 
the selection is based on the spectral envelope of 
the subbands. In accordance with the prior Mazor, 
et al. approach, the number of transmitted sub- 
bands and the bit allocation to the transmitted 
subbands are both predetermined (off-line) and 
kept fixed. Specification of the subbands which are 
transmitted using the predetermined bit allocation 
is adapted to each window of speech. 

In accordance with the present invention, the 
number of subbands to be transmitted and the bit 
allocation to the subbands are recalculated (on-line) 
for each coding frame (speech window) to match 
the signal short-term spectral power distribution. 
The procedure used to calculate the per-frame 
allocation is the dynamic bit-allocation 28. Given 
the frame spectral envelope estimate and the total 
number of available quantization bits, the algorithm 
attempts to minimize the residual coefficients aver- 
age quantization error (in the mean squared-error, 
mse, sense) for that frame. This process automati- 
cally adjusts the bit allocation to match the spectral 
characteristics of the signal and the specified com- 
pression rate. 

Given a total number of available quantization
bits $B_T$ and a set of available quantizers for
quantizing coefficients, the system searches for a bit
allocation $B = (b_1, b_2, b_3, \ldots, b_L)$ where $b_i$ is the
number of bits per coefficient in the i-th subband.
The available quantizers quantize to a number of
bits from the range [0, minb, minb + 1, minb + 2, ...,
maxb]. The bit allocation is such that (1) $b_i$ is a
nonnegative integer in the range of available
quantizers, (2) $b_1 + b_2 + \cdots + b_L \le B_T$, and (3) the
average quantization error is minimized.

The search for optimal allocation is based on
the unconstrained solution
$r_i = \max[0, \log_2(P_i/D)] = \max[0, LP_i - LD]$,
where $D$ is the expected
mse distortion for the window of speech, $r_i$ is a
nonnegative real number, and $LP_i$ and $LD$ are
respectively the log-power and log-mse terms. The
solutions $r_i$ are rounded to the available quantization
levels to provide $b_i$. The power spectrum
density $P = (P_1, P_2, P_3, \ldots, P_L)$ can be derived
from the spectral envelope coefficients. In accordance
with the present approach, the expected
distortion $D$ is estimated and the bit allocation
vector $B$ is calculated from the unconstrained
solution of $r_i$. In an iterative process, the distortion $D$ is
then re-estimated and the bit allocation vector $B$ is
recomputed to obtain the bit allocation for which
the sum $B_s = b_1 + b_2 + \cdots + b_L$ is as close as
possible to $B_T$. The bit allocation is then fine-tuned
to make full use, or nearly full use, of the available
bits $B_T$.
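
The unconstrained solution can be evaluated directly from the formula above. The sketch below assumes base-2 logarithms, as in the expression for $r_i$; the function name `unconstrained_bits` and the power values are hypothetical.

```python
import math


def unconstrained_bits(powers, distortion):
    """Compute r_i = max(0, LP_i - LD) for each subband power P_i."""
    ld = math.log2(distortion)  # log-mse term LD
    return [max(0.0, math.log2(p) - ld) for p in powers]


powers = [64.0, 16.0, 4.0, 1.0, 0.25]  # hypothetical subband powers P_i
rates_low_distortion = unconstrained_bits(powers, distortion=1.0)
rates_high_distortion = unconstrained_bits(powers, distortion=4.0)
```

Raising the expected distortion lowers every $r_i$ by the same amount until it reaches zero, which is what lets a single parameter $LD$ control the total number of bits allocated.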

Although the power density $P_i$ could be computed
directly from the four coefficients of each
subband, it is in this case only estimated or derived
at 27 from the i-th subband spectral envelope.
From the spectral envelope estimate obtained
at 22, the power density is obtained from a
precalculated conversion table. As a result, no
additional side information is needed to transmit the
power spectrum density.

The iterative process 28 of determining the
expected mse distortion and thus computing the bit
allocation vector $B$ from the expected distortion and
power spectrum density $P$ is illustrated in FIG. 4.
The system searches for the best value $LD$ using a
bisecting search. Assume that the expected mse
distortion which allows for full use of the available
quantization bits is $LD_{\exp}$. The system begins with
an estimated value $LD$ midway between the maximum
and minimum possible values for $LD$. $LD_{\max}$
and $LD_{\min}$ are equal to the natural log of the
maximum and minimum power respectively of the
power spectrum $P$ as specified in the look up table
or by the nature of a specific signal and bit rate.
($P_{\min}$ should be greater than zero, because log 0 is
undefined.) From the first estimate of $LD$,
$LD^0 = (LD_{\min} + LD_{\max})/2$, the bit allocation vector $B$ is
computed from $r_i$, where $b_i$ is the value $r_i$ rounded
to available quantizer levels. The total number of
bits thus allocated is compared to the number of
available bits. If the solution indicates that more
bits may be used, indicating that a lower level of
distortion may be obtained, the value $LD$ is
decreased. On the other hand, if the solution indicates
that more than the available bits $B_T$ have been
allocated, a greater distortion must be accepted to
reduce the number of allocated bits.

In the example of FIG. 4, assume that the best
solution is $LD_{\exp}$. The first solution of $B$ from $LD^0$
would indicate that too many bits were allocated.
Therefore, the distortion would be increased from
$(LD_{\max} + LD_{\min})/2$ to a level $LD^1$ toward $LD_{\max}$.
Specifically, $LD^1$ is the point bisecting the interval
$[LD^0, LD_{\max}]$, computed as
$LD^1 = LD^0 + (LD_{\max} - LD_{\min})/4$.
The bit allocation would be recomputed,
and it would be determined from that new solution
in this example that additional bits would be
available and $LD$ should be decreased. A new distortion
level $LD^2$ bisecting the prior two distortion levels
would then be used to solve again for the bit
allocation $B$.

This iterative process would continue until the
number of allocated bits was equal to, or very near
to, the available bit allocation $B_T$. Because the
solution must be of integer values, the calculations
may result in an oscillation about the level $LD_{\exp}$.
As discussed below, the system avoids such
oscillation and fine tunes the bit allocation in a
heuristic fashion.

FIG. 5 illustrates the dynamic bit allocation 
procedure in greater detail. As noted above, power 
density for each subband is estimated by table 
lookup from the spectral envelope parameters de- 
termined for each subband. For purposes of deter- 
mining the bit allocation, the subbands are then 
ordered in decreasing power density at 50. The 
system is initialized at 52. The initial distortion
factor $\log_2 D^0 = LD^0$ is computed as
$(LD_{\min} + LD_{\max})/2$. An initial distortion increment value $d_0$ is
set equal to $(LD_{\max} - LD_{\min})/2$. Two counters j and k
for limiting the duration of searches are set at 0.

In a first loop 54 for performing the search
illustrated in FIG. 4, j is set equal to j + 1 and $d_j$ is
set equal to $d_{j-1}/2$ at 56. To limit the search once
the value $LD$ is very near to the expected value
$LD_{\exp}$, j is compared to a value $j_{\max}$ at 58. If the
system has reached the maximum value of j, it
goes to a heuristic fine-tuning step at II. Otherwise,
the unconstrained solution of $r_i$ is computed at 60.
That solution is then rounded at 62 to $b_i$.

In the preferred available range of quantizer 
values, quantizers from minb equal to 2 bits per 
coefficient through maxb equal to 9 bits per coeffi- 
cient are available. A zero bit quantizer is always 
included in the set of quantizers. Although it is 
preferred that quantizers are available at all integer 
values between minb and maxb, such is not a 
requirement. All solutions $r_i$ are rounded to the
nearest available quantizer value. Specifically, in
the present implementation if $r_i$ is computed to be
0.8, that value is rounded to $b_i$ = 0 since a 1 bit
quantizer is not available. Similarly, if $r_i$ is computed
to be 1.1, it is rounded to $b_i$ = 2, the nearest
available quantizer level.
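
The rounding rule can be illustrated with a short sketch over the preferred quantizer set (zero bits, plus 2 through 9 bits). The function name `round_to_quantizer` is hypothetical, and the handling of exact ties (for example a value of 1.0, which is equally far from 0 and 2) is an assumption, since the text does not address it; here ties go to the smaller level.

```python
def round_to_quantizer(r, levels=(0, 2, 3, 4, 5, 6, 7, 8, 9)):
    """Return the available bit count nearest to the real-valued solution r.

    min() returns the first minimal element, so ties resolve to the
    smaller level (an assumed tie-breaking rule).
    """
    return min(levels, key=lambda b: abs(b - r))


examples = [round_to_quantizer(r) for r in (0.8, 1.1, 1.6, 12.0)]
```

The first two values reproduce the examples in the text: 0.8 rounds to 0 bits and 1.1 rounds to 2 bits. Values above maxb are clamped to 9 bits.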

At 64, the sum of the bit allocations $B_s$ is
computed, as are the difference between the total
available bits $B_T$ and $B_s$ and the absolute value of
that difference. If it is determined at 68 that $B_T$ is
equal to $B_s$, the system has the solution and stops.
If the value $B_T - B_s$ is greater than zero, indicating that
additional bits are available and the distortion level
can be decreased, $LD$ is set equal to $LD - d_j$ at 70
and the system returns to the beginning of loop 54.
On the other hand, if the value $B_T - B_s$ is less than
zero, indicating that too many bits were allocated
and that a greater distortion must be accepted, $LD$
is set equal to $LD + d_j$ and the system goes to the
beginning of the loop 54.
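
Loop 54 can be sketched as a bisecting search over $LD$. This is a simplified illustration under stated assumptions: the names `allocate` and `bisect_allocation` are hypothetical, $LD_{\min}$ and $LD_{\max}$ are taken from the frame's own log-powers rather than from a look-up table, the log-powers are assumed to be base 2, and ties in rounding go to the smaller quantizer level.

```python
LEVELS = (0, 2, 3, 4, 5, 6, 7, 8, 9)  # zero bits plus minb=2 .. maxb=9


def allocate(log_powers, ld):
    """Rounded bits b_i for log-powers LP_i at log-distortion LD."""
    return [min(LEVELS, key=lambda b: abs(b - max(0.0, lp - ld)))
            for lp in log_powers]


def bisect_allocation(log_powers, total_bits, j_max=16):
    """Search LD so that sum(b_i) approaches total_bits.

    Returns (LD, bits) for the iteration with the smallest |B_T - B_s|.
    """
    ld_min, ld_max = min(log_powers), max(log_powers)  # assumed search range
    ld = (ld_min + ld_max) / 2.0       # LD^0
    step = (ld_max - ld_min) / 2.0     # d_0
    best = None
    for _ in range(j_max):
        step /= 2.0                    # d_j = d_{j-1} / 2
        bits = allocate(log_powers, ld)
        slack = total_bits - sum(bits)  # B_T - B_s
        if best is None or abs(slack) < abs(best[0]):
            best = (slack, ld, bits)
        if slack == 0:
            break                      # exact solution found
        # Spare bits: lower the distortion. Too many bits: raise it.
        ld = ld - step if slack > 0 else ld + step
    return best[1], best[2]


ld_a, bits_a = bisect_allocation([8.0, 6.0, 4.0, 2.0, 0.0], total_bits=12)
ld_b, bits_b = bisect_allocation([8.0, 6.0, 4.0, 2.0, 0.0], total_bits=10)
```

Keeping the best iteration seen so far mirrors the selection of the nearest solution before heuristic fine-tuning when no exact match is reached within the iteration limit.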

Once the system has processed through the
loop 54 to the point where j is greater than $j_{\max}$ at
58, a heuristic routine II is begun. From the prior
loops at 54, the $LD$ associated with the smallest
value of the absolute value of $B_T - B_s$ is selected at
74 as the nearest solution. The solution of $B$ is
then calculated again at 76 and rounded at 78. $B_s$
and $B_T - B_s$ are calculated at 80. Counters j and i
are set to zero and L + 1 respectively at 81. If
$B_T - B_s$ is greater than zero, indicating that additional
bits are available, one bit is added to the prerounded
value $r_j$ beginning with the highest energy
subband where j equals 1. To that end, counter j is
incremented at 84. One bit is added to $r_j$ at 85. The
new solution is then rounded at 86 and $B_s$ and
$B_T - B_s$ are computed at 88. If $B_T - B_s$ is then equal to
zero the search stops. If it is still greater than zero
the system loops back to 84 to increment the next
value $r_j$.

It is possible for the addition of 1 bit at 85 to
increase the number of rounded bits by 2. For
example, if the prerounded $r_j$ had previously been
0.6, resulting in zero bits being allocated for that
subband, the addition of one bit results in an
unrounded value of 1.6 which results in two bits for
that subband. If the addition of the bit results in
$B_T - B_s$ being less than zero, the value k is
incremented at 92 and compared to $k_{\max}$ at 94. If k has
not reached the maximum value the system moves
to a decrementing loop beginning at 96. To avoid
oscillation between incrementing and decrementing,
once the value $k_{\max}$ is reached, the system
goes to a final step III.

If $B_T - B_s$ is found to be less than zero at either
82 or 90, bits must be decremented. At 96 the
counter i is decremented. One bit is then subtracted
at 98 from the prerounded $r_i$, beginning with i = L, the
lowest energy subband. The solution is rounded at
99, and $B_s$ and $B_T - B_s$ are recomputed at 100. So
long as $B_T - B_s$ is found to be less than zero at
either 82 or 102, the next lowest energy value $r_i$ is
decremented at 98. If $B_T - B_s$ is found to be equal
to zero at 102, the allocation is stopped. If it is
found to be greater than zero, k is incremented at 104
and compared to $k_{\max}$ at 106. Again, once $k_{\max}$ is
reached the system goes to routine III. Otherwise
the system goes to the incrementing step 84.

If the system fails to provide a $B_s$ equal to $B_T$
in step II, routine III is followed. A counter n is set
at L + 1 at 108 and decremented at 110. If it is
determined at 112 that $B_T$ is greater than $B_s$ the
system stops. One or more available bits may not
be used. If the number of allocated bits is greater
than $B_T$, the system sets the lower energy subband
bits $b_n$ to zero at 114 beginning with n equal to L
until $B_T - B_s \ge 0$.
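
Routines II and III can be sketched in simplified form as follows. This is not a step-by-step transcription of FIG. 5: the name `fine_tune` is hypothetical, the counter k here counts every pass rather than only overshoots, decremented values are clamped at zero, and rounding ties go to the smaller level. All of these are assumptions made to keep the sketch short.

```python
LEVELS = (0, 2, 3, 4, 5, 6, 7, 8, 9)  # zero bits plus minb=2 .. maxb=9


def round_bits(rates):
    """Round each prerounded value to the nearest available quantizer."""
    return [min(LEVELS, key=lambda b: abs(b - r)) for r in rates]


def fine_tune(rates, total_bits, k_max=4):
    """rates: prerounded r values sorted by decreasing subband power."""
    rates = list(rates)
    bits = round_bits(rates)
    hi, lo, k = 0, len(rates) - 1, 0
    while sum(bits) != total_bits and k < k_max:
        if sum(bits) < total_bits:
            # Routine II, incrementing: add a bit, highest energy first.
            while sum(bits) < total_bits and hi < len(rates):
                rates[hi] += 1.0
                hi += 1
                bits = round_bits(rates)
        else:
            # Routine II, decrementing: remove a bit, lowest energy first.
            while sum(bits) > total_bits and lo >= 0:
                rates[lo] = max(0.0, rates[lo] - 1.0)
                lo -= 1
                bits = round_bits(rates)
        k += 1  # bound the number of passes to avoid oscillation
    # Routine III: if still over budget, zero the lowest energy subbands.
    n = len(bits) - 1
    while sum(bits) > total_bits and n >= 0:
        bits[n] = 0
        n -= 1
    return bits


tuned_up = fine_tune([4.6, 2.6, 0.6, 0.0], total_bits=9)
tuned_down = fine_tune([4.4, 3.4, 2.4, 2.4], total_bits=9)
unreachable = fine_tune([0.6, 0.0], total_bits=1)
```

In the last case no allocation of exactly one bit exists, because a 1 bit quantizer is not available, so the sketch returns an allocation that leaves the bit unused, consistent with the statement that one or more available bits may not be used.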

As an alternative to incrementing and decrementing
by power level, the system could increment
or decrement subbands by order of frequency.
Thus, for example, the additional bits in routine
II could be added to the low frequency subbands
and subtracted from the high frequency subbands.

From the computed bit allocation $B$, appropriate
quantizers are applied to coefficients of selected
subbands of the residual spectrum at 30. In
a preferred implementation, the 2D quantizers
range from minb = 2 bits per coefficient to maxb
= 9 bits per coefficient, and each of these quantizers
is designed using an approach presented by
Linde, et al., "An Algorithm for Vector Quantizer
Design," IEEE Trans. on Commun., Vol. COM-28, pp.
84-95, January 1980.

At the receiver, the spectral envelope estimate 
which is transmitted in frequency sequence is used 
to recalculate at 32 the signal power spectrum 
density $P = (P_1, P_2, \ldots, P_L)$. The power spectrum
density is used to determine at 34 the correspond- 
ing bit allocation which, in turn, controls the inverse 
quantization process 36 as well as the spectral 
replication process 38. As in the prior Mazor, et al. 
approach, the coefficients of the subbands which 
are not transmitted are approximated by replication 
at 38 of transmitted subbands. Once the equalized 
spectrum of Figure 3 is recreated by replication of 
subbands, a reproduction of the spectrum of Figure 
2 can be generated at 40 by applying the transmit- 
ted scale factors to the equalized spectrum. From 
that reproduction of the original Fourier transform, 
the speech can be obtained through an inverse 
FFT 42, an inverse scaler 46, a digital to analog 
converter and a reconstruction filter 48. 

The important benefits associated with the dy- 
namic bit allocation coding technique are (1) the 
coder is automatically optimized to match the 
specified coding rate, and (2) the coder's bit allocation
is dynamically adjusted to track the signal
short term spectrum. With these added capabilities
the coder handles more effectively than the prior 
coder non-stationary signals like speech, non- 
speech signals like voiceband data, and variable 
rate coding. 

The coding technique provides for excellent 
speech coding and reproduction at rates as low as 
9.6 kb/s. In terms of signal to noise ratio (SNR), the
algorithm with the above specific implementation, 
outperforms the prior Mazor, et al. algorithm by 
over 2 dB at 16 kb/s and by more than 1 dB at 9.6 
kb/s. The improvement in performance is even 
more dramatic when voiceband data and DTMF 
(dual tone multifrequency) signals are considered. 
This coder handles very well up to 2400-b/s data at 
16-kb/s compression rate and up to 1200-b/s data 
at 12-kb/s compression rate. On the other hand, the 
prior coder can only handle up to 1200-b/s data at 
16-kb/s compression rate. In terms of SNR, this 
coder outperforms the prior coder by over 5 dB at 
16-kb/s and 3 dB at 12 kb/s for voiceband data, and
by over 6 dB at 16 kb/s as well as at 12 kb/s for 
DTMF signals. 

While this invention has been particularly
shown and described with reference to preferred
embodiments thereof, it will be understood by
those skilled in the art that various changes in form
and details may be made therein without departing
from the spirit and scope of the invention as
defined by the appended claims.

For example, transforms other than the Fourier
Transform, such as the cosine transform, may be
used. Also, the specific choice of available quantizers
can vary. The encoding and decoding may
be performed by dedicated hardware or in a software
controlled system.

Claims 

1. A speech coding system comprising:

transform means for performing a discrete
transform of a window of speech to generate a
discrete transform spectrum of coefficients;

envelope defining and encoding means for
defining an approximate envelope of the discrete
spectrum in each of a plurality of subbands
of coefficients and for encoding the defined
envelope of each subband of coefficients;

means for scaling each spectrum coefficient
relative to the defined envelope of the
respective subband of coefficients;

a plurality of quantizers of different bit
lengths for encoding scaled spectrum coefficients
within each subband; and

means for determining a quantizer, if any,
to be used to encode the coefficients of each
subband within each window of speech
wner^^ 

used in successive windows is variable. 

2. A speech coding system as claimed in Claim 1
wherein the means for determining allocates
bits to minimize quantization error for each
window of speech.

3. A speech coding system as claimed in Claim 1
wherein the means for determining iteratively
estimates distortion and computes for each
distortion estimate a bit allocation across the
spectrum to near fully allocate a predetermined
number of bits.

4. A speech coding system as claimed in Claim 3
wherein the means for determining computes the
allocation of bits as a number of bits greater
than or equal to zero which is derived from
$\log_2(P_j/D)$ for each subband where $P_j$ is a power
density estimate for the subband and $D$ is a
distortion error estimate for the window of
speech.

5. A speech coding system as claimed in Claim 4
wherein $P_j$ is estimated for each subband from
the defined envelope of the respective subband
of coefficients.

6. A speech coding system as claimed in Claim 4
wherein the means for determining tunes the
computed bit allocation toward the number of
available bits by adding bits to higher energy
subbands or subtracting bits from lower energy
subbands.

7. A speech coding system as claimed in Claim 6
wherein $P_j$ is estimated for each subband from
the defined envelope of the respective subband
of coefficients.

8. A speech coding system as claimed in Claim 1
wherein the transform means performs a discrete
Fourier transform.



9. A method of coding speech comprising:

performing a discrete transform of a window
of speech to generate a discrete spectrum
of coefficients;

defining an approximate envelope of the
discrete spectrum in each of a plurality of
subbands of coefficients and digitally encoding
the defined envelope of each subband of
coefficients; and

with at least one of a plurality of quantizers
of different bit lengths, encoding each of
scaled coefficients within subbands into a
number of bits, the quantizer used for each
subband being determined for each window of
speech to minimize distortion.

10. A method as claimed in Claim 9 wherein the
quantizers are determined to minimize quantization
error for each window of speech.

11. A method as claimed in Claim 9 wherein the
quantizers are determined by iteratively estimating
distortion and computing for each distortion
estimate a bit allocation across the
spectrum to nearly fully allocate a predetermined
number of bits.

12. A method as claimed in Claim 11 wherein the
quantizers are determined by computing the
allocation of bits as a number of bits greater
than or equal to zero which is derived from
$\log_2(P_j/D)$ for each subband where $P_j$ is a
power density estimate for the subband and $D$
is a distortion error estimate for the window of
speech.

13. A method as claimed in Claim 12 wherein $P_j$ is
estimated for each subband from the defined
envelope of the respective subband of coefficients.

14. A method as claimed in claim 12 wherein the
allocation of bits is tuned toward the number of
available bits by adding bits to higher energy
subbands and subtracting bits from lower energy
subbands.

15. A method as claimed in Claim 6 wherein $P_j$ is
estimated for each subband from the defined
envelope of the respective subband of coefficients.

16. A method as claimed in Claim 1 wherein the
transform is a discrete Fourier transform.